
25 research outputs found

Explainable deep learning models for biological sequence classification

Author: Budach Stefan
Publication date: 01/01/2021

Biological sequences - DNA, RNA and proteins - orchestrate the behavior of all living cells and trying to understand the mechanisms that govern and regulate the interactions among these molecules has motivated biological research for many years. The introduction of experimental protocols that analyze such interactions on a genome- or transcriptome-wide scale has also established the usage of machine learning in our field to make sense of the vast amounts of generated data. Recently, deep learning, a branch of machine learning based on artificial neural networks, and especially convolutional neural networks (CNNs) were shown to deliver promising results for predictive tasks and automated feature extraction. However, the resulting models are often very complex and thus make model application and interpretation hard, but the possibility to interpret which features a model has learned from the data is crucial to understand and to explain new biological mechanisms. This work therefore presents pysster, our open source software library that enables researchers to more easily train, apply and interpret CNNs on biological sequence data. We evaluate and implement different feature interpretation and visualization strategies and show that the flexibility of CNNs allows for the integration of additional data beyond pure sequences to improve the biological feature interpretability. We demonstrate this by building, among others, predictive models for transcription factor and RNA-binding protein binding sites and by supplementing these models with structural information in the form of DNA shape and RNA secondary structure. Features learned by models are then visualized as sequence and structure motifs together with information about motif locations and motif co-occurrence. By further analyzing an artificial data set containing implanted motifs we also illustrate how the hierarchical feature extraction process in a multi-layer deep neural network operates. 
Finally, we present a larger biological application by predicting RNA-binding of proteins for transcripts for which experimental protein-RNA interaction data is not yet available. Here, the comprehensive interpretation options of CNNs made us aware of potential technical bias in the experimental eCLIP data (enhanced crosslinking and immunoprecipitation) that were used as a basis for the models. This allowed for subsequent tuning of the models and data to get more meaningful predictions in practice.
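The abstract describes CNN filters being visualized as sequence motifs. A minimal sketch of why that reading is possible, assuming a hand-written filter rather than a trained one: a first-layer convolution filter slides over the one-hot encoded sequence and scores each window, so a filter whose weights favour particular bases behaves like a motif detector. The names `one_hot`, `scan`, and the `UGCA` motif are hypothetical and are not part of pysster's API.

```python
# Illustrative sketch only: a hand-built filter, not a trained model and not pysster code.
ALPHABET = "ACGU"


def one_hot(seq, alphabet=ALPHABET):
    """Encode a sequence as one row per position, one column per symbol."""
    return [[1.0 if base == symbol else 0.0 for symbol in alphabet] for base in seq]


def scan(encoded, kernel):
    """Slide a kernel (width x alphabet size) along the sequence without padding."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        total = 0.0
        for offset in range(width):
            for channel, weight in enumerate(kernel[offset]):
                total += weight * encoded[start + offset][channel]
        scores.append(total)
    return scores


# Hypothetical filter: +1 where the base matches the motif, -1 everywhere else.
motif = "UGCA"
kernel = [[1.0 if symbol == base else -1.0 for symbol in ALPHABET] for base in motif]

sequence = "AAUGCAGG"
scores = scan(one_hot(sequence), kernel)
best = max(range(len(scores)), key=scores.__getitem__)
print(best, scores[best])  # prints: 2 4.0
```

The highest-scoring window starts at position 2, which is where `UGCA` occurs. In a trained network the weights are learned, and motif visualizations are typically derived from the windows that activate a filter most strongly.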

Institutional Repository of the Freie Universität Berlin

FUNDAMENTAL ADVERTISING PRINCIPLES

Author: Belka Claus
Budach Wilfried
Eldh L.
Eltzschig H.
Handrick Rene
Heinzelmann Frank
Jendrossek Verena
Uhlig Stefan
Velalakan A.
Publication date: 01/01/2007

eLibrary National Mining University

A comparative study of machine learning methods for time-to-event survival data for radiomics risk modelling

Author: Budach Volker
Kotzerke Jörg
Leger Stefan
Linge Annett
Lohaus Fabian
Pilz Karoline
Schreiber Andreas
Tinhofer Inge
Zwanenburg Alex
Zöphel Klaus
Publication date: 01/01/2017

Radiomics applies machine learning algorithms to quantitative imaging data to characterise the tumour phenotype and predict clinical outcome. For the development of radiomics risk models, a variety of different algorithms is available and it is not clear which one gives optimal results. Therefore, we assessed the performance of 11 machine learning algorithms combined with 12 feature selection methods by the concordance index (C-Index), to predict loco-regional tumour control (LRC) and overall survival for patients with head and neck squamous cell carcinoma. The considered algorithms are able to deal with continuous time-to-event survival data. Feature selection and model building were performed on a multicentre cohort (213 patients) and validated using an independent cohort (80 patients). We found several combinations of machine learning algorithms and feature selection methods which achieve similar results, e.g., MSR-RF: C-Index = 0.71 and BT-COX: C-Index = 0.70 in combination with Spearman feature selection. Using the best performing models, patients were stratified into groups of low and high risk of recurrence. Significant differences in LRC were obtained between both groups on the validation cohort. Based on the presented analysis, we identified a subset of algorithms which should be considered in future radiomics studies to develop stable and clinically relevant predictive models for time-to-event endpoints.
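The abstract ranks models by the concordance index. As a minimal sketch, assuming Harrell's definition (the abstract does not say which variant the authors used), the C-index is the fraction of comparable patient pairs in which the patient with the earlier observed event also received the higher predicted risk. The function name and the toy data below are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (assumed variant).

    times:  follow-up time per patient
    events: True if the event was observed, False if censored
    risks:  predicted risk score (higher means earlier event expected)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four patients, the third one censored at time 6.
times = [2, 4, 6, 8]
events = [True, True, False, True]
risks = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risks)
print(c_index)  # prints: 0.8
```

Five pairs are comparable here and four are ordered correctly, giving 0.8. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which puts the reported values of 0.70 and 0.71 in context.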

Institutional Repository of the Freie Universität Berlin

Intensity modulated radiotherapy for high risk prostate cancer based on sentinel node SPECT imaging for target volume definition

BACKGROUND: The RTOG 94-13 trial has provided evidence that patients with high risk prostate cancer benefit from an additional radiotherapy to the pelvic nodes combined with concomitant hormonal ablation. Since lymphatic drainage of the prostate is highly variable, the optimal target volume definition for the pelvic lymph nodes is problematic. To overcome this limitation, we tested the feasibility of an intensity modulated radiation therapy (IMRT) protocol that takes into consideration the individual pelvic sentinel node drainage pattern determined by SPECT functional imaging. METHODS: Patients with high risk prostate cancer were included. Sentinel nodes (SN) were localised 1.5–3 hours after injection of 250 MBq (99m)Tc-Nanocoll using a double-headed gamma camera with an integrated X-ray device. All sentinel node localisations were included in the pelvic clinical target volume (CTV). Dose prescriptions were 50.4 Gy (5 × 1.8 Gy / week) to the pelvis and 70.0 Gy (5 × 2.0 Gy / week) to the prostate including the base of seminal vesicles or whole seminal vesicles. Patients were treated with IMRT. Furthermore, a theoretical comparison between IMRT and a three-dimensional conformal technique was performed. RESULTS: Since 08/2003, 6 patients have been treated with this protocol. All patients had detectable sentinel lymph nodes (total 29). 4 of 6 patients showed sentinel node localisations (total 10) that would not have been treated adequately with CT-based planning only ('geographical miss'). The most common localisation for a probable geographical miss was the perirectal area. The comparison between dose-volume histograms of IMRT and conventional CT planning demonstrated clear superiority of IMRT when all sentinel lymph nodes were included. IMRT allowed a significantly better sparing of normal tissue and reduced the volumes of small bowel, large bowel and rectum irradiated with critical doses. No gastrointestinal or genitourinary acute toxicity of Grade 3 or 4 (RTOG) occurred.
CONCLUSION: IMRT based on sentinel lymph node identification is feasible and reduces the probability of a geographical miss. Furthermore, IMRT allows a pronounced sparing of normal tissue from irradiation. Thus, the chosen approach will help to increase the curative potential of radiotherapy in high risk prostate cancer patients.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Predictive modeling of long non-coding RNA chromatin (dis-)association

Author: Budach Stefan
Marsico Annalisa
Ntini Evgenia
Vang Ørom Ulf A
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 17/12/2020

Long non-coding RNAs (lncRNAs) are involved in gene expression regulation in cis and trans. Although enriched in the chromatin cell fraction, to what degree this defines their broad range of functions remains unclear. In addition, the factors that contribute to lncRNA chromatin tethering, as well as the molecular basis of efficient lncRNA chromatin dissociation and its functional impact on enhancer activity and target gene expression, remain to be resolved. Here, we combine pulse-chase metabolic labeling of nascent RNA with chromatin fractionation and transient transcriptome sequencing to follow nascent RNA transcripts from their co-transcriptional state to their release into the nucleoplasm. By incorporating functional and physical characteristics in machine learning models, we find that parameters such as co-transcriptional splicing contribute to efficient lncRNA chromatin dissociation. Intriguingly, lncRNAs transcribed from enhancer-like regions display reduced chromatin retention, suggesting that, in addition to splicing, lncRNA chromatin dissociation may contribute to enhancer activity and target gene expression.

Repository: Freie Universität Berlin (FU), Math Department (fu_mi_publications)

Generic accelerated sequence alignment in SeqAn using vectorization and multi-threading

Author: Budach Stefan
Costanza Pascal
Ehrhardt Marcel
Hancox Jonny
Rahn René
Reinert Knut
Publication venue: 'Oxford University Press (OUP)'
Publication date: 15/10/2018

Motivation: Pairwise sequence alignment is undoubtedly a central tool in many bioinformatics analyses. In this paper, we present a generically accelerated module for pairwise sequence alignments applicable for a broad range of applications. In our module, we unified the standard dynamic programming kernel used for pairwise sequence alignments and extended it with a generalized inter-sequence vectorization layout, such that many alignments can be computed simultaneously by exploiting SIMD (Single Instruction Multiple Data) instructions of modern processors. We then extended the module by adding two layers of thread-level parallelization, where we a) distribute many independent alignments on multiple threads and b) inherently parallelize a single alignment computation using a work stealing approach producing a dynamic wavefront progressing along the minor diagonal. Results: We evaluated our alignment vectorization and parallelization on different processors, including the newest Intel® Xeon® (Skylake) and Intel® Xeon Phi™ (KNL) processors, and use cases. The instruction set AVX512-BW (Byte and Word), available on Skylake processors, can genuinely improve the performance of vectorized alignments. We could run single alignments 1600 times faster on the Xeon Phi™ and 1400 times faster on the Xeon® than executing them with our previous sequential alignment module. Availability: The module is programmed in C++ using the SeqAn (Reinert et al., 2017) library and distributed with version 2.4 under the BSD license. We support SSE4, AVX2, AVX512 instructions and included UME::SIMD, a SIMD-instruction wrapper library, to extend our module for further instruction sets. We thoroughly test all alignment components with all major C++ compilers on various platforms.
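The wavefront along the minor diagonal mentioned in the abstract rests on a property of the alignment recurrence: every cell on one anti-diagonal depends only on cells from the two preceding anti-diagonals, so cells on the same anti-diagonal can be computed independently. The following scalar sketch illustrates that traversal order only. It is not SeqAn code, it uses neither SIMD nor threads, and the function name and the linear gap scoring are assumptions made for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score, filled anti-diagonal by anti-diagonal."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for d in range(rows + cols - 1):  # d = i + j indexes the anti-diagonal
        first = max(0, d - cols + 1)
        last = min(d, rows - 1)
        # Cells on anti-diagonal d read only anti-diagonals d-1 and d-2,
        # so this inner loop could run in parallel.
        for i in range(first, last + 1):
            j = d - i
            if i == 0:
                score[i][j] = j * gap
            elif j == 0:
                score[i][j] = i * gap
            else:
                substitution = match if a[i - 1] == b[j - 1] else mismatch
                score[i][j] = max(
                    score[i - 1][j - 1] + substitution,
                    score[i - 1][j] + gap,
                    score[i][j - 1] + gap,
                )
    return score[-1][-1]


print(global_alignment_score("ACGT", "AGT"))  # prints: 2
```

Three matches and one gap give a score of 2 for the example. The result is identical to a row-by-row fill; only the order of evaluation changes, and that order is what exposes the parallelism.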

Repository: Freie Universität Berlin (FU), Math Department (fu_mi_publications)

Advanced Data Processing – The Cornerstone of Efficient GCxGC/TOF MS Method Development

Author: Belka Claus
Budach Wilfried
Junginger Dorothea
Marini Patrizia
Niyazi Maximilian
Stickl Stefan
Publication date: 01/01/2009

Purpose: The combination of ionizing radiation with the pro-apoptotic TRAIL receptor antibody lexatumumab has been shown to exert considerable synergistic apoptotic effects in vitro and in short-term growth delay assays. To clarify the relevance of these effects for local tumour control, long-term experiments using a colorectal xenograft model were conducted. Materials and methods: Colo205-xenograft bearing NMRI (nu/nu) nude mice were treated with fractionated irradiation (5 x 3 Gy, d1-5) and lexatumumab (0.75 mg/kg, d1, 4 and 8). The tumour-bearing hind limbs were irradiated with graded single top-up doses at d8 under normoxic (ambient) and acute hypoxic (clamped) conditions. Experimental animals were observed for 270 days. Growth delay and local tumour control were the end points of the study. Statistical analysis of the experiments included evaluation of tumour regrowth and local tumour control. Results: Combined treatment with irradiation and lexatumumab led to a pronounced tumour regrowth delay when compared to irradiation alone. The long-term experiments presented here revealed a highly significant rise in local tumour control for normoxic (ambient) (p = 0.000006) and hypoxic treatment (p = 0.000030). Conclusion: Our data show that a combination of the pro-apoptotic antibody lexatumumab with irradiation reduces tumour regrowth and leads to a highly increased local tumour control in a nude mouse model. This substantial effect was observed under ambient conditions and was more pronounced under hypoxic conditions.

Crossref

Directory of Open Access Journals

Open Access LMU

PubMed Central

Open Repository and Bibliography - Liège

Thermoradiotherapy with interstitial thermoseeds for localised prostate cancer

Author: Dirk Böhmer
Ingolf Türk
Jan Roigas
Serdar Deger
Stefan Loening
Volker Budach
Publication venue: 'Elsevier BV'

Crossref

Evaluation of prognostic factors and role of participation in a randomized trial or a prospective registry in pediatric and adolescent nonmetastatic medulloblastoma: a report from the HIT 2000 trial

Purpose: We aimed to compare treatment results in and outside of a randomized trial and to confirm factors influencing outcome in a large retrospective cohort of nonmetastatic medulloblastoma treated in Austria, Switzerland and Germany. Methods and Materials: Patients with nonmetastatic medulloblastoma (n = 382) aged 4 to 21 years and primary neurosurgical resection between 2001 and 2011 were assessed. Between 2001 and 2006, 176 of these patients (46.1%) were included in the randomized HIT SIOP PNET 4 trial. From 2001 to 2011 an additional 206 patients were registered to the HIT 2000 study center and underwent the identical central review program. Three different radiation therapy protocols were applied. Genetically defined tumor entity (former molecular subgroup) was available for 157 patients. Results: Median follow-up time was 7.3 (range, 0.09-13.86) years. There was no difference between HIT SIOP PNET 4 trial patients and observational patients outside the randomized trial, with 7-year progression-free survival (PFS) rates of 79.5% ± 3.1% versus 78.7% ± 3.1% (P = .62). On univariate analysis, the time interval between surgery and irradiation (≤ 48 days vs ≥ 49 days) showed a strong trend to affect PFS (80.4% ± 2.2% vs 64.6% ± 9.1%; P = .052). Furthermore, histologically and genetically defined tumor entities and the extent of postoperative residual tumor influenced PFS. On multivariate analyses, a genetically defined tumor entity (wingless-related integration site-activated vs non-wingless-related integration site/non-SHH, group 3: hazard ratio, 5.49; P = .014) and time interval between surgery and irradiation (hazard ratio, 2.2; P = .018) were confirmed as independent risk factors. Conclusions: Using a centralized review program and risk-stratified therapy for all patients registered to the study center, outcome was identical for patients with nonmetastatic medulloblastoma treated on and off the randomized HIT SIOP PNET 4 trial. The prognostic values of prolonged time to RT and genetically defined tumor entity were confirmed.

OPUS Augsburg

Kölner UniversitätsPublikationsServer

ZORA

Archive ouverte UNIGE

Assessment and safety of operation of oil and gas pipelines in non-steady conditions of technological parameters

Author: Belka Claus
Boras Ruzica
Budach Wilfried
Eldh Therese
Eltzschig Holger K.
Handrick Rene
Heinzelmann Frank
Henkel Marco
Jendrossek Verena
Köhler David
Lauber Kirsten
Martin Christian
Nowak Kerstin
Uhlig Stefan
Wehrmann Manfred
Publication venue: Tomsk Polytechnic University (Томский политехнический университет)
Publication date: 01/01/2006

Relevance. Changes in the technological parameters of product pumping during the operation of oil and gas pipelines, compared with steady-state operating conditions, cause additional mechanical stresses in the pipe wall and reduce safety margins. As a result, the pipeline service life specified at the design stage decreases and the risk of accidents grows. This makes it necessary to develop methods for assessing and ensuring the safety of oil and gas pipelines under non-steady technological operating parameters. Aim: to assess and ensure the safe operation of oil and gas pipelines under non-steady technological pumping parameters. Object: the pipeline system of the oil and gas industry. Methods: theoretical studies of the operational safety of oil and gas pipelines under non-steady technological parameters of the pumping regime. Results. Analytical dependences of pipeline safety margins on the parameters of pumping-regime non-steadiness were obtained. Recommendations are given for ensuring the safety of oil and gas pipelines under non-steady technological operating parameters. Conclusions. Under non-steady technological operating parameters, increased mechanical stresses arise in the pipe walls of oil and gas pipelines, reducing the safety and service life of the structure. Under identical internal-pressure loading, the highest stresses arise at the sections where the pipeline connects to equipment that is absolutely rigid against deformation. The level of mechanical stress in the pipe wall is reduced by smooth regulation of the pumping regime, which on oil pipelines is implemented by main pumps equipped with frequency-regulated electric drives. Safe operation of oil and gas pipelines under non-steady technological pumping parameters can be achieved by regulating the product pumping regime with pumping units equipped with adjustable drives.

Electronic archive of Tomsk Polytechnic University